Alleviating Linear Ecological Bias and Optimal Design with Sub-sample Data.
Authors
Abstract
In this paper, we illustrate that combining ecological data with subsample data in situations in which a linear model is appropriate provides three main benefits. First, by including the individual level subsample data, the biases associated with linear ecological inference can be eliminated. Second, by supplementing the subsample data with ecological data, the information about parameters will be increased. Third, we can use readily available ecological data to design optimal subsampling schemes, so as to further increase the information about parameters. We present an application of this methodology to the classic problem of estimating the effect of a college degree on wages. We show that combining ecological data with subsample data provides precise estimates of this value, and that optimal subsampling schemes (conditional on the ecological data) can provide good precision with only a fraction of the observations.
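The first benefit described above can be illustrated with a small simulation. The sketch below is not the authors' estimator: it assumes a hypothetical data-generating model in which individual wages depend on a binary college-degree indicator and also on the share of degree holders in the individual's area (a contextual effect). All variable names, parameter values, and sample sizes are invented for illustration. Under that assumption, a regression on area-level averages alone recovers the sum of the individual and contextual effects, while a modest individual-level subsample combined with the area-level shares separates the two.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: G areas, n individuals per area, m subsampled per area.
G, n, m = 200, 500, 50
alpha, beta, gamma = 1.0, 0.5, 1.0  # intercept, individual effect, contextual effect (assumed)

# Individual-level data, which in practice would be unobserved outside the subsample.
p = rng.uniform(0.1, 0.9, size=G)                      # area-level degree propensity
x = rng.binomial(1, p[:, None], size=(G, n)).astype(float)
xbar = x.mean(axis=1)                                  # ecological covariate: share with a degree
y = alpha + beta * x + gamma * xbar[:, None] + rng.normal(size=(G, n))
ybar = y.mean(axis=1)                                  # ecological response: mean wage per area


def ols(design, response):
    """Ordinary least squares coefficients via numpy's least-squares solver."""
    return np.linalg.lstsq(design, response, rcond=None)[0]


# Ecological regression: area means only. The slope estimates beta + gamma, not beta.
eco_slope = ols(np.column_stack([np.ones(G), xbar]), ybar)[1]

# Subsample: individuals within an area are exchangeable here, so taking the
# first m columns is equivalent to a simple random subsample of size m per area.
xs = x[:, :m].ravel()
ys = y[:, :m].ravel()
xbar_s = np.repeat(xbar, m)                            # attach the ecological share to each person

# Combined regression: individual covariate plus the area-level share.
combined_slope = ols(np.column_stack([np.ones(G * m), xs, xbar_s]), ys)[1]

print(f"ecological slope: {eco_slope:.3f} (targets beta + gamma = {beta + gamma})")
print(f"combined slope:   {combined_slope:.3f} (targets beta = {beta})")
```

The subsample here is a simple random draw within each area. The optimal subsampling designs mentioned in the abstract would instead choose how many individuals to draw from each area using the ecological data, which this sketch does not attempt.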
Related papers
Alleviating Ecological Bias in Voter Turnout Models (and other Generalized Linear Models) with Optimal Subsample Design
In this paper, we illustrate that combining ecological data with subsample data in situations in which a generalized linear model (GLM) is appropriate provides two main benefits. First, by including the individual level subsample data, the biases associated with ecological inference in GLMs can be eliminated. Second, available ecological data can be used to design optimal subsampling schemes, s...
Alleviating Ecological Bias in Poisson Models using Optimal Subsampling: The Effects of Jim Crow on Black Illiteracy in the Robinson Data
In many situations data are available at the group level but one wishes to estimate the individual-level association between a response and an explanatory variable. Unfortunately this endeavor is fraught with difficulties because of the ecological level of the data. The only reliable solution to such ecological inference problems is to supplement the ecological data with individual-level data. ...
THE COMPARISON OF TWO METHOD NONPARAMETRIC APPROACH ON SMALL AREA ESTIMATION (CASE: APPROACH WITH KERNEL METHODS AND LOCAL POLYNOMIAL REGRESSION)
Small Area estimation is a technique used to estimate parameters of subpopulations with small sample sizes. Small area estimation is needed in obtaining information on a small area, such as sub-district or village. Generally, in some cases, small area estimation uses parametric modeling. But in fact, a lot of models have no linear relationship between the small area average and the covariat...
Strategies for monitoring and evaluation of resource-limited national antiretroviral therapy programs: the two-phase design
BACKGROUND In resource-limited settings, monitoring and evaluation (M&E) of antiretroviral treatment (ART) programs often relies on aggregated facility-level data. Such data are limited, however, because of the potential for ecological bias, although collecting detailed patient-level data is often prohibitively expensive. To resolve this dilemma, we propose the use of the two-phase design. Spec...
Using Inverse Probability Bootstrap Sampling to Eliminate Sample Induced Bias in Model Based Analysis of Unequal Probability Samples
In ecology, as in other research fields, efficient sampling for population estimation often drives sample designs toward unequal probability sampling, such as in stratified sampling. Design based statistical analysis tools are appropriate for seamless integration of sample design into the statistical analysis. However, it is also common and necessary, after a sampling design has been implemente...
Journal:
- Journal of the Royal Statistical Society. Series A
Volume 171, Issue 1
Pages: -
Publication date: 2008